Add geneLME: parallel per-gene LME with flexible contrast specification #2
Open
Introduces geneLME(), a parallelised per-gene linear mixed effects modelling function built on lme4/emmeans, with full contrast support and benchmarking against kimma::kmFit.

Key features:
- geneLME_contrast_spec(): pre-run helper to enumerate available contrast levels and build contrast_spec / contrasts_primary arguments
- geneLME_build_contrast_args(): pre-computes Branch A contrast vectors once before parallel dispatch (eliminates per-gene rebuilds)
- Branch A: explicit pairwise interaction contrasts via contrast_spec
- Branch B: named contrast vectors on marginal means with optional 'by' grouping
- Second-order contrasts (contrasts-of-contrasts) in both branches
- Singular fit flagging: model_status = "singular_fit" instead of hard stop; results returned for all genes, filter downstream on model_status
- Soft-fail on wrong-length contrasts_secondary with indexed $contrast_spec returned for debugging
- FDR adjustment within term (ANOVA) or contrast x order (contrasts)
- Warning suppression: lmer() rescaling + package version messages silenced
- Pre-flight input validation with informative errors (11 checks)

Benchmarked vs kimma::kmFit (2,000 genes, 8 cores, 5 reps):
- 100% direction agreement with kimma across all contrast pairs
- ~1.8x faster than kimma at 3-6 contrasts; equal speed at 66
- Estimate r=1.0, MAD~0 vs kimma after direction correction

Includes: test suite, tutorial (Rmd + HTML), function overview (md + HTML), benchmark reports v1 and v2 (Rmd + HTML), dev history (geneLME_dev.R, geneLME_dev2.R), and CLAUDE_NOTES session log.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
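For readers unfamiliar with the terminology, the following is a rough single-gene sketch of "named contrast vectors on marginal means" and a second-order contrast, written directly against lme4 and emmeans. It does not call geneLME(), whose signature is not shown in this conversation. The simulated data, the column names (`expr`, `treatment`, `donor`) and the contrast names are all assumptions made for illustration.

```r
library(lme4)
library(emmeans)

set.seed(1)

# Simulated data for one hypothetical gene: 12 donors x 3 treatments.
dat <- expand.grid(
  donor     = factor(1:12),
  treatment = factor(c("ctrl", "lps", "ifn"), levels = c("ctrl", "lps", "ifn"))
)
donor_effect <- rnorm(12)  # random intercept per donor
dat$expr <- 5 +
  1.0 * (dat$treatment == "lps") +
  0.5 * (dat$treatment == "ifn") +
  donor_effect[as.integer(dat$donor)] +
  rnorm(nrow(dat), sd = 0.5)

# Per-gene mixed model with a random intercept for donor.
fit <- lmer(expr ~ treatment + (1 | donor), data = dat)

# Marginal means per treatment level. Asymptotic df are requested here only
# to avoid needing an extra package for Kenward-Roger df in this sketch.
emm <- emmeans(fit, ~ treatment, lmer.df = "asymptotic")

# First-order contrasts: named coefficient vectors, one weight per level,
# in the level order ctrl, lps, ifn.
first_order <- contrast(emm, method = list(
  lps_vs_ctrl = c(-1, 1, 0),
  ifn_vs_ctrl = c(-1, 0, 1)
))

# Second-order contrast (contrast of contrasts): difference between the
# two first-order effects.
second_order <- contrast(first_order, method = list(
  lps_effect_vs_ifn_effect = c(1, -1)
))

first_df  <- as.data.frame(first_order)
second_df <- as.data.frame(second_order)
print(first_df)
print(second_df)
```

In a per-gene pipeline the same contrast lists would be reused for every gene, which is presumably why building them once before parallel dispatch saves time.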
Summary
- Adds `geneLME()`, a parallelised per-gene linear mixed effects modelling function built on `lme4` + `emmeans`, designed as a drop-in alternative to `kimma::kmFit` with more flexible contrast specification
- Two contrast branches: Branch A (explicit pairwise interaction contrasts via a `contrast_spec` data frame) and Branch B (named contrast vectors on marginal means with optional grouping)
- Singular fits are flagged with `model_status = "singular_fit"` instead of triggering a hard stop, so results are returned for all genes
- Pre-computed contrast vectors (`geneLME_build_contrast_args()`) eliminate per-gene rebuilds in parallel workers
- Benchmarked against `kimma::kmFit`: 100% direction agreement, ~1.8× faster at 3–6 contrasts, equal speed at 66

Files added
- `geneLME.R`
- `geneLME_test.R`
- `geneLME_tutorial.Rmd` / `.html`
- `geneLME_function_overview.md` / `.html`
- `geneLME_benchmark.Rmd` / `.html`
- `geneLME_benchmark2.Rmd` / `.html`
- `geneLME_dev.R`, `geneLME_dev2.R`
- `CLAUDE_NOTES_geneLME.md`

Test plan
- Source `geneLME.R` and run `geneLME_test.R`: all 7 tests should pass (Branch A, Branch A with second-order, Branch B, validation errors 6a–6f, soft-fail 6g)
- Knit `geneLME_tutorial.Rmd` to verify the tutorial renders cleanly
- Review `geneLME_benchmark2.html` for sign consistency and speed results vs kimma

🤖 Generated with Claude Code
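Because singular fits are flagged rather than dropped, downstream code has to filter on `model_status` itself. A minimal base-R sketch of that filtering, followed by a grouped FDR adjustment, might look like the following. The result table here is hypothetical: the real `geneLME()` output columns may differ, `"ok"` is an assumed status value, and Benjamini-Hochberg is an assumption, since the description only says "FDR adjustment".

```r
# Hypothetical per-gene contrast results; column names are illustrative only.
res <- data.frame(
  gene         = rep(c("g1", "g2", "g3", "g4"), times = 2),
  contrast     = rep(c("lps - ctrl", "ifn - ctrl"), each = 4),
  p_value      = c(0.001, 0.02, 0.04, 0.8, 0.01, 0.01, 0.5, 0.9),
  model_status = rep(c("ok", "ok", "singular_fit", "ok"), times = 2)
)

# Keep only rows from models that were not flagged as singular.
res_ok <- res[res$model_status != "singular_fit", ]

# Adjust p-values separately within each contrast, so each contrast is
# corrected across genes only (assumed method: Benjamini-Hochberg).
res_ok$FDR <- ave(
  res_ok$p_value, res_ok$contrast,
  FUN = function(p) p.adjust(p, method = "BH")
)

print(res_ok)
```

Whether to recompute FDR after removing flagged genes, as done here, or to keep the values computed over all genes is a judgment call that depends on the analysis.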